Abstract

The Fréchet distance is a similarity measure between two curves $A$ and $B$: informally,
it is the minimum length of a leash required to connect a dog, constrained to be on $A$,
and its owner, constrained to be on $B$, as they walk without backtracking along their
respective curves from one endpoint to the other. The advantage of this measure over
other measures, such as the Hausdorff distance, is that it takes into account the
ordering of the points along the curves.

The discrete Fréchet distance replaces the dog and its owner by a pair of frogs
that can only reside on $n$ and $m$ specific pebbles on the curves $A$ and $B$, respectively.
These frogs hop from a pebble to the next without backtracking. The discrete Fréchet
distance can be computed by a rather straightforward quadratic dynamic programming
algorithm. However, despite a considerable amount of work on this problem and its
variations, there is no subquadratic algorithm known, even for approximation versions
of the problem.

In this paper we present a subquadratic algorithm for computing the discrete Fréchet
distance between two sequences of points in the plane, of respective lengths $m \le n$. The
algorithm runs in $O\left(\frac{mn \log\log n}{\log n}\right)$ time and uses $O(n + m)$ storage.
Our approach uses the geometry of the problem in a subtle way to encode legal positions
of the frogs as states of a finite automaton.
1 Introduction



Problem statement. Let $A = (a_1, \ldots, a_m)$ and $B = (b_1, \ldots, b_n)$ be two sequences
of $m$ and $n$ points, respectively, in the plane. The discrete Fréchet distance
$\delta_{dF}(A,B)$ between $A$ and $B$ is defined as follows. Fix a distance $\delta > 0$ and
consider the Cartesian product $A \times B$ as the vertex set of a directed graph $G_\delta$
whose edge set is

$$E_\delta = \bigl\{ \bigl((a_i,b_j),(a_{i+1},b_j)\bigr) \;\big|\; \|a_i - b_j\|,\ \|a_{i+1} - b_j\| \le \delta \bigr\}
\;\cup\; \bigl\{ \bigl((a_i,b_j),(a_i,b_{j+1})\bigr) \;\big|\; \|a_i - b_j\|,\ \|a_i - b_{j+1}\| \le \delta \bigr\};$$

here we consider the case where $\|\cdot\|$ is the Euclidean norm. Then $\delta_{dF}(A,B)$ is
the smallest $\delta > 0$ for which $(a_m, b_n)$ is reachable from $(a_1, b_1)$ in $G_\delta$.
Informally, think of $A$ and $B$ as two sequences of stepping stones, and of two frogs, the
$A$-frog and the $B$-frog, where the $A$-frog has to visit all the $A$-stones in order, and
the $B$-frog has to visit all the $B$-stones in order. The frogs are connected to each other
by a rope of length $\delta$, and are initially placed at $a_1$ and $b_1$, respectively. At each
move, exactly one of the frogs can jump from its current stone to the next one, which can
be done if and only if its distances to the other frog, before and after the jump, are both
at most $\delta$ (see Figure 2 for an example of a possible sequence of jumps of the two
frogs). Then $\delta_{dF}(A,B)$ is the smallest $\delta > 0$ for which there exists a sequence
of jumps that gets the frogs to $a_m$ and $b_n$, respectively. (Note that the frogs cannot
backtrack.)

Remark. In this formulation we forbid the frogs to jump simultaneously, from a placement
$(a_i, b_j)$ to $(a_{i+1}, b_{j+1})$. However, our algorithm can be modified so that it also
applies to the variant where such "diagonal" moves are also allowed. (See a remark in
Section 2.1.)

The continuous Fréchet distance. The discrete Fréchet distance problem is a variant
of the (more standard, continuous) Fréchet distance problem. Informally, consider a person
and a dog connected by a leash, each walking along a path (curve) from its starting point
to its end point. Both are allowed to control their speed, but they cannot backtrack. The
Fréchet distance between the two curves is the minimal length of a leash that is sufficient
for traversing both curves in this manner.

More formally, a curve $f \subseteq \mathbb{R}^2$ is a continuous mapping from $[0,1]$ to
$\mathbb{R}^2$. A reparameterization is a continuous nondecreasing surjection
$\alpha : [0,1] \to [0,1]$, such that $\alpha(0) = 0$ and $\alpha(1) = 1$. The Fréchet distance
$\delta_F(f,g)$ between two curves $f$ and $g$ is then defined as follows:

$$\delta_F(f,g) = \inf_{\alpha,\beta} \, \max_{t \in [0,1]} \bigl\{ \| f(\alpha(t)) - g(\beta(t)) \| \bigr\},$$

where $\|\cdot\|$ is the underlying norm (typically, the Euclidean norm), and $\alpha$ and
$\beta$ are reparameterizations of $[0,1]$.

The semi-continuous Fréchet distance. One may also consider a hybrid version of the
problem, of a person walking a frog. Formally, we have a curve $f$ and a sequence $B$ of $n$
stepping stones. We want to find the smallest $\delta > 0$ for which $f$ can be partitioned
into $n$ (pairwise openly disjoint) arcs $f_1, \ldots, f_n$ so that the distance of $b_i$ from
every point of $f_i$ is at most $\delta$, for $i = 1, \ldots, n$. In this setup, for each
$i = 1, \ldots, n$, the person walks along $f_i$ from its starting point to its end point,
while the frog sits at $b_i$. Then, when the person reaches the endpoint, the frog jumps to
$b_{i+1}$, and they keep moving this way until all of $f$ and $B$ are traversed.

Remark. All three variants of the Fréchet distance can be extended, in an obvious manner,
to any dimension $d > 2$, but in this paper we only consider the planar case.

Background. Motivated by a variety of applications, the Fréchet distance has been studied
extensively in computational geometry for the past 20 years, as a useful measure for
the similarity between curves [9]. If data is uniformly sampled, which is often the case
in practice, it suffices to compute the discrete Fréchet distance between the sequences of
vertices of the two curves. The extended model that also allows diagonal moves (as in a
preceding remark) can potentially allow us to sample more sparsely along relatively straight
portions of the curves.

Eiter and Mannila [14] showed that the discrete Fréchet distance in the plane can be
computed in quadratic time (that is, in $O(mn)$ time). Later, Aronov et al. [5] gave a
$(1+\varepsilon)$-approximation algorithm which solves the discrete Fréchet distance problem
between the vertices of two backbone curves in near-linear time. Backbone curves are
required to have edges whose lengths are close to 1, and a constant lower bound on the
minimal distance between any pair of vertices; they model, e.g., the backbone chains of
proteins. Concerning the continuous Fréchet distance problem, Alt and Godau [3] showed
that the Fréchet distance of two polygonal curves with a total of $n$ edges in the plane can
be computed in $O(n^2 \log n)$ time. A lower bound of $\Omega(n \log n)$ time for the decision
version of the problem, where the task is to decide whether the Fréchet distance between
two curves is smaller than or equal to a given value, was given by Buchin et al. [5]. They
also showed that this bound holds for the discrete version of the problem as well. It has
been an open problem to compute (exactly) the continuous or discrete Fréchet distance in
subquadratic time. Even the simpler variant, in which we only want to solve the decision
version of the discrete Fréchet distance problem in the plane in subquadratic time, has
still been open. In fact, only a few years ago, Alt [2] conjectured that the decision
subproblem of the (continuous) Fréchet distance problem is 3SUM-hard [15].

We note that it is also an open problem to solve the approximation versions of the Fréchet
distance problems in subquadratic time. That is, no subquadratic algorithm (in $m$ and $n$,
with any reasonable dependence on $\varepsilon$) is known for computing a
$(1+\varepsilon)$-approximation of either variant of the Fréchet distance (for arbitrary
curves / sequences, with no restrictions on their shape).

To date, the only subquadratic algorithms known for the Fréchet distance problem
(either continuous or discrete) are for restricted classes of curves, such as the algorithm of
Aronov et al. [5] mentioned above. Other classes of curves considered so far in the literature
include closed convex curves and $\kappa$-bounded curves [3]. A curve is $\kappa$-bounded if,
for any pair of points $a, b$ on the curve, the portion of the curve between $a$ and $b$ is
contained in $D(a, \frac{\kappa}{2}\|a - b\|) \cup D(b, \frac{\kappa}{2}\|a - b\|)$, where
$D(p,r)$ denotes the disk of radius $r$ centered at $p$. Alt et al. [3] showed that the
Fréchet distance between two convex curves equals their Hausdorff distance, and that the
Fréchet distance between two $\kappa$-bounded curves is at most $(1 + \kappa)$ times their
Hausdorff distance, and thus an $O(n \log n)$ algorithm for computing or approximating the
Hausdorff distance (as given in [2]) can be applied to obtain an efficient exact solution in
the convex case or a constant-factor approximation in the $\kappa$-bounded case. Later,
Driemel et al. [13] provided a $(1+\varepsilon)$-approximation algorithm for $c$-packed curves
in $\mathbb{R}^d$ that runs in $O(cn/\varepsilon + cn \log n)$ time, where a curve $\pi$ is called
$c$-packed if the total length of $\pi$ inside any ball is bounded by $c$ times the radius of
the ball.

Another variant of the Fréchet distance is the weak Fréchet distance, which, in the
person-dog scenario, allows the person and the dog to also walk backwards. Recently,
Har-Peled and Raichel [16] gave a quadratic algorithm for computing (a generalization of) the
weak Fréchet distance between curves. More specifically, given two simplicial complexes in
$\mathbb{R}^d$, and start and end vertices in each complex, they show how to compute two
curves in these complexes that connect the corresponding start and end points, such that
the weak Fréchet distance between these curves is minimized. Since a polygonal curve is a
simplicial complex, this can be viewed as a generalization of the regular notion of the weak
Fréchet distance between curves.

See also [10], [12] for a few additional results on the Fréchet distance.

Our results. We present a new algorithm for computing the discrete Fréchet distance
whose running time is $O(mn \log\log n / \log n)$ (assuming $m \le n$). We first present a
procedure for solving the decision version of the problem: Given $\delta > 0$, determine
whether the discrete Fréchet distance between $A$ and $B$ is $\le \delta$. The decision
procedure runs in $O(mn \log\log n / \log^2 n)$ time and uses $O(m + n)$ space. To obtain a
solution for the optimization problem, we combine the decision procedure with a relatively
simple explicit binary search, based on a simple procedure for distance selection [1]. This
increases the total running time by only a factor of $O(\log n)$, so the overall algorithm
runs in $O(mn \log\log n / \log n)$ time, which is still subquadratic. Using (a variant of) the
procedure in [1], the space required by the optimization algorithm remains linear in
$m + n$. The following presentation is therefore mainly focused on the decision procedure,
which is the more involved part of our algorithm.
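The interplay between the decision procedure and the binary search can be illustrated by a deliberately naive sketch. It is not the procedure of this paper: the function `decide` is a plain quadratic dynamic program standing in for the subquadratic decision procedure, and the distance-selection step is replaced by explicitly sorting all $mn$ pairwise distances (which costs quadratic space, unlike the linear-space method referred to above). The names `decide` and `discrete_frechet` are hypothetical. The sketch only relies on two facts: the answer is one of the pairwise distances $\|a_i - b_j\|$, and the decision predicate is monotone in $\delta$.

```python
import math

def decide(A, B, delta):
    """Quadratic stand-in for the decision procedure: is delta_dF(A, B) <= delta?"""
    row = None
    for i, a in enumerate(A):
        cur = []
        for j, b in enumerate(B):
            if math.dist(a, b) > delta:
                ok = False                      # the rope is too short for this placement
            elif i == 0 and j == 0:
                ok = True                       # starting placement
            else:
                # reach (a_i, b_j) by a B-move from (a_i, b_{j-1}) or an A-move from (a_{i-1}, b_j)
                ok = (j > 0 and cur[j - 1]) or (i > 0 and row[j])
            cur.append(ok)
        row = cur
    return row[-1]

def discrete_frechet(A, B):
    """Binary search over the sorted candidate values (all pairwise distances)."""
    dists = sorted({math.dist(a, b) for a in A for b in B})
    lo, hi = 0, len(dists) - 1                  # the largest distance always suffices
    while lo < hi:
        mid = (lo + hi) // 2
        if decide(A, B, dists[mid]):
            hi = mid
        else:
            lo = mid + 1
    return dists[lo]
```

For two parallel three-point sequences at vertical distance 1 and unit horizontal spacing, the sketch returns $\sqrt{2}$ rather than 1, because simultaneous ("diagonal") jumps are forbidden in this formulation.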

Although not detailed in this abstract, our technique can be extended so as to compute,
within the same time bound, (i) the discrete Fréchet distance between two sequences of
points in $\mathbb{R}^d$, for any $d \ge 3$, and (ii) the semi-continuous Fréchet distance
between a sequence of points and a curve in the plane. (We do not have at the moment a
similar extension to the continuous Fréchet distance, which is one of the main open problems
raised by our work.)

A brief sketch of the decision procedure. Let us first provide a brief description of
the decision procedure for a given $\delta > 0$. We begin by presenting a slightly less
efficient but considerably simpler solution, on which we will then build our improved
solution. Consider the following 0/1 matrix $M$, whose rows (resp., columns) correspond to
the points of $A$ (resp., of $B$). An entry $M_{i,j}$ of $M$ is equal to 1 if the pair
$(a_i, b_j)$ is reachable from the starting placement $(a_1, b_1)$ of the trip with a "leash"
of length $\delta$. Otherwise, $M_{i,j}$ is equal to 0. In other words, $M_{i,j} = 1$ if the
discrete Fréchet distance $\delta_{dF}$ between the two prefix subsequences
$(a_1, \ldots, a_i)$ and $(b_1, \ldots, b_j)$ is at most $\delta$, and $M_{i,j} = 0$ otherwise.
Thus determining the value of $M_{m,n}$ solves the overall decision problem.

$M_{m,n}$ can be obtained by computing all entries of $M$ using dynamic programming, as
follows. If $\|a_1 - b_1\| \le \delta$, we set $M_{1,1} := 1$; otherwise, $M_{1,1} := 0$ and the
decision procedure is aborted right away, since $\delta$ is too small even for the initial
placement. The other elements of the first row of $M$ are then filled in order. Specifically,
for each $1 < j \le n$ we set $M_{1,j} := 1$ if (a) $M_{1,j-1} = 1$, and (b)
$\|a_1 - b_j\| \le \delta$; otherwise we set $M_{1,j} := 0$. (Clearly, if $M_{1,j} = 0$, for some
$1 < j \le n$, then all the subsequent entries of the first row are also zero.) Similarly,
the first column of $M$ is filled in by setting, for each $1 < i \le m$ in order,
$M_{i,1} := 1$ if (a) $M_{i-1,1} = 1$, and (b) $\|a_i - b_1\| \le \delta$; otherwise, we set
$M_{i,1} := 0$. For an arbitrary entry $M_{i,j}$, $1 < i \le m$, $1 < j \le n$, we set
$M_{i,j} := 1$ if (a) at least one of $M_{i,j-1}$ and $M_{i-1,j}$ is 1, and (b)
$\|a_i - b_j\| \le \delta$; otherwise, we set $M_{i,j} := 0$. The cost of this dynamic
programming procedure is $O(mn)$.
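The dynamic program just described can be written down in a few lines. The following is a minimal sketch, assuming points are given as coordinate tuples and using 0-based indices (so `A[0]` plays the role of $a_1$); the function name `frechet_decision` is hypothetical. It keeps only two rows of $M$ at a time, so it uses $O(n)$ working space while still taking $O(mn)$ time.

```python
import math

def frechet_decision(A, B, delta):
    """Return True iff the discrete Frechet distance (no diagonal moves) is <= delta."""
    m, n = len(A), len(B)

    def close(p, q):
        return math.dist(p, q) <= delta

    # M[1][1]: abort right away if delta is too small for the initial placement.
    if not close(A[0], B[0]):
        return False

    # First row: M[1][j] = 1 iff M[1][j-1] = 1 and ||a_1 - b_j|| <= delta.
    prev = [False] * n
    prev[0] = True
    for j in range(1, n):
        prev[j] = prev[j - 1] and close(A[0], B[j])

    for i in range(1, m):
        cur = [False] * n
        # First column: M[i][1] = 1 iff M[i-1][1] = 1 and ||a_i - b_1|| <= delta.
        cur[0] = prev[0] and close(A[i], B[0])
        for j in range(1, n):
            # General entry: a neighbour to the left or above is reachable, and the pair is close.
            cur[j] = (cur[j - 1] or prev[j]) and close(A[i], B[j])
        prev = cur

    return prev[n - 1]          # the entry M[m][n]
```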

To obtain a subquadratic decision procedure, we cannot compute each value of $M$
explicitly, and instead we only compute certain rows and columns of $M$. To be more precise,
we partition $A$ into $\ell = O(m/\log^2 n)$ layers $A_1, \ldots, A_\ell$, each of length
$c_1 \log^2 n$, where $c_1 > 0$ is an appropriate constant, such that the last point of any
layer $A_i$ is the first point of the next layer $A_{i+1}$. We can think of this as a partition
of $M$ into $\ell$ "horizontal" strips, each of width $c_1 \log^2 n$, such that the last row of
a strip is the first row of the next strip. (See Figure 1 for an illustration.) We then
compute, for each strip (in order), the entries of $M$ in the last row of the strip, and we
use the values of this row as input for the processing of the next strip.
Figure 1: A partition of $M$ into horizontal strips (and substrips), which correspond to layers (and
blocks) of $A$. $B$ is partitioned into subsequences of length $\tau$. Each subsequence of $B$
corresponds to a single symbol $\Sigma_j$ which the automata $\mathcal{K}^*$ process.



To obtain the running time bound claimed above, we need to compute the entries of $M$ in
each of the $\ell + 1$ "boundary" rows (including the first and the last rows) in
$O(n \log\log n)$ time. To do so, we further partition each layer $A_i$ into $t = O(\log n)$
blocks, of length $c_2 \log n$ each, where $c_2 > 0$ is a sufficiently small constant to be
specified later (or, alternatively, partition each strip of $M$ into $\Theta(\log n)$
substrips, of width $c_2 \log n$ each). As before, the last point of a block is the first
point of the next block. We handle each block in $O(n \log\log n / \log n)$ time, using an
approach that resembles the execution of a deterministic finite automaton $\mathcal{K}^*$.
Somewhat informally, the automaton is constructed from the corresponding block of $A$, and
we execute it on a string constructed from the elements of $B$. To achieve the desired
running time (in particular, to avoid having to spend time linear in $n$ in "reading" the
individual elements of $B$), we partition $B$ into $\mu = O(n \log\log n / \log n)$
subsequences of length $\tau = c_3 \log n / \log\log n$ each, where $c_3 > 0$ is yet another
constant, and require $\mathcal{K}^*$ to operate on each subsequence, in constant time, as if
it were a single symbol.

We note that the compaction of $M$ outlined above is similar to compactions used to
solve several related problems. For instance, Baran et al. [6] present an $o(n^2)$ algorithm
for the 3SUM problem on integers of bounded length. (See also algorithms for the edit
distance problem; [17], [19].) However, while the other compactions are purely symbolic, ours
is strongly based on the geometry of the problem. A major difference between our algorithm
and the other ones is that in our case the input of the problem in itself does not include
repetitions (that can be used in the compaction). That is, the input points are not likely
to repeat themselves. We create repetitions artificially by constructing the arrangement
$\mathcal{A}$ of the disks centered at the points of $A$, and locating the points of $B$ in
this arrangement. Now, the faces of $\mathcal{A}$ that contain the points of $B$ generally
repeat themselves. The finite-state automaton $\mathcal{K}^*$ that we construct operates on
the faces of $\mathcal{A}$ rather than on the points of $B$, and this leads to the desired
subquadratic performance. Using such an automaton for the compaction also appears to be a
novel technique.

Organization. In Section 2 we describe the decision procedure in detail. In particular, in
Section 2.1 we show how to deal (slightly less efficiently) with a single block of $A$. We
then show, in Section 2.2, how to handle a layer of $A$, which contains $\Theta(\log n)$ such
blocks, by combining portions of the processing of the separate blocks into a common
procedure that is executed at the layer level. Finally, in Section 2.3 we describe the
overall decision procedure. (The justification of using blocks of size $O(\log n)$ is
deferred to Section 4, where we present a lower-bound construction that indicates that using
blocks of larger size may cause our respective automata $\mathcal{K}^*$ to be too large for a
subquadratic algorithm.) In Section 3 we show how to combine the decision procedure with an
elementary binary search, and obtain the main result of this paper, namely a subquadratic
algorithm for computing the discrete Fréchet distance.

2 The decision procedure 

In this section we focus on the decision problem: Given $\delta > 0$, determine whether
$\delta_{dF}(A, B) \le \delta$. By an appropriate scaling, we may assume, without loss of
generality, that $\delta = 1$.

As mentioned in the introduction, we partition $A$ into $\ell = O(m/\log^2 n)$ layers, of size
$c_1 \log^2 n$ each (where $c_1 > 0$ is an appropriate constant whose value will be set
later), such that the last point of each layer is the first point of the next layer, and
process them in order. To process a single layer of $A$, we further partition it into
$t = O(\log n)$ blocks, of size $c_2 \log n$ each (where $c_2 > 0$ is a sufficiently small
constant, also to be specified later), such that the last point of each block is the first
point of the next block. The algorithm processes the blocks within a layer one by one in
order. The purpose of processing a layer is to collect, in a single processing step,
information that will be needed by each of its blocks. Then each block is processed
separately, in order, and the $M$-entries of its terminal row are computed from those in the
initial row, all the way to the terminal row of the entire layer.
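The two-level partition with shared endpoints can be made concrete by the following small sketch. It is only an illustration of the indexing: the sizes `layer_size` and `block_size` are passed in as plain integers (standing in for $c_1 \log^2 n$ and $c_2 \log n$, whose constants are not fixed here), indices are 0-based, and the helper names are hypothetical.

```python
def chunks_with_shared_endpoints(lo, hi, size):
    """Split the index range lo..hi (inclusive) into consecutive chunks of `size`
    indices each, where the last index of a chunk is the first index of the next.
    The final chunk may be shorter."""
    if size < 2:
        raise ValueError("a chunk must contain at least two indices")
    out = []
    start = lo
    while start < hi:
        end = min(start + size - 1, hi)
        out.append((start, end))
        start = end                      # the shared endpoint
    return out or [(lo, hi)]             # a single-point range forms one chunk

def layers_and_blocks(m, layer_size, block_size):
    """For a sequence of m points, return one list of blocks per layer."""
    return [chunks_with_shared_endpoints(lo, hi, block_size)
            for lo, hi in chunks_with_shared_endpoints(0, m - 1, layer_size)]
```

For instance, `layers_and_blocks(10, 7, 3)` yields two layers, covering indices 0..6 and 6..9, whose blocks are `(0, 2), (2, 4), (4, 6)` and `(6, 8), (8, 9)`, respectively.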

2.1 Handling a single block of A 

Here are the details of processing a single block. To simplify the notation, we denote the
block by $A$; its size $m$ now satisfies $m = c_2 \log n$ (the very last block of the entire
sequence may be smaller). Enumerate the points of $A$ as $a_1, \ldots, a_m$.

Regard the points $a_1, \ldots, a_m$ as the centers of respective unit disks
$D_1, \ldots, D_m$, and let $\mathcal{D}$ denote the sequence of these disks. Consider the
arrangement $\mathcal{A} = \mathcal{A}(\mathcal{D})$ of the disks, and associate with each
face $f$ of $\mathcal{A}$ the subset $\mathcal{D}_f$ of disks containing $f$. For each point
$b_i \in B$, denote by $f_i$ the face of $\mathcal{A}$ containing $b_i$.



Remark. The description given in this subsection provides the essential ingredients of the
processing of a block, but is somewhat lax or vague about precise implementation details,
which have to be applied with care to ensure the running time we are after. For example, a
naive implementation of the step that finds the faces $f_i$, by $n$ point locations of the
points of $B$ in $\mathcal{A}$, is too expensive for our purpose. The layers are used to
combine some parts of the processing within their blocks into a single processing step,
thereby improving the efficiency of the procedure. More details are provided in the next
subsection.
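To make the role of the faces concrete, here is a brute-force sketch that computes, for each point of $B$, the subset of disks of the block that contain it. Two assumptions are made for illustration only. First, a face is identified with its subset $\mathcal{D}_f$ (distinct faces of the arrangement that happen to lie in exactly the same disks are merged; the moves described in this section depend on a face only through $\mathcal{D}_f$). Second, each point is tested against every disk, which is precisely the kind of naive step that the remark above rules out for the actual algorithm. Indices are 0-based and the function names are hypothetical.

```python
import math

def disks_containing(block, b, radius=1.0):
    """Indices k of the disks D_k (centred at block[k]) that contain the point b."""
    return frozenset(k for k, a in enumerate(block) if math.dist(a, b) <= radius)

def face_ids(block, B, radius=1.0):
    """Replace each point of B by a small integer naming its disk subset.
    Equal integers mean that the two points lie in exactly the same disks."""
    ids = {}
    return [ids.setdefault(disks_containing(block, b, radius), len(ids)) for b in B]
```

Even when no two points of $B$ coincide, the resulting identifiers typically repeat, which is the source of the repetitions exploited by the automaton.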

Fix two indices $1 \le i \le j \le n$, and call the pair $\bigl((a_1, b_i), (a_m, b_j)\bigr)$
valid if there exists a path in $G_\delta$ ($G_1$, that is) from $(a_1, b_i)$ to
$(a_m, b_j)$. We can simulate such a path as a sequence of moves between basic states, where
each basic state is a pair $(f, D_k)$, where $f$ is a face of $\mathcal{A}$ and $D_k$ is a
disk in $\mathcal{D}_f$. In each move we either pass from $(f, D_k)$ to $(f', D_k)$, where
$f'$ is another face of $\mathcal{A}$ which is also contained in $D_k$, or pass from
$(f, D_k)$ to $(f, D_{k+1})$, if $D_{k+1}$ also belongs to $\mathcal{D}_f$ (i.e., also
contains $f$). See Figure 2. In the original problem (involving the complete unpartitioned
$A$) we would have to start at $(f_1, D_1)$ and to reach $(f_n, D_m)$ (now with $m$ equal to
the original size of $A$), using a sequence of legal moves between basic states, of the
types just described, that corresponds to a path in $G_1$ from $(a_1, b_1)$ to $(a_m, b_n)$.
(For this, though, we would need to construct the huge arrangement of the disks for the
entire sequence $A$, which would have been far too expensive.) In the refined version we
start at $(f_i, D_1)$ and have to reach $(f_j, D_m)$ along a similar sequence of moves, for
arbitrary indices $i \le j$ (and for the much smaller size $m$ of a block). This represents
the situation where the portion of the trip of the $B$-frog that corresponds to the passage
of the $A$-frog through the points of the present block $A$ starts at $b_i$ and ends at
$b_j$.

Note that, in view of this interpretation, we are only interested in placements
$(a_1, b_i)$ that can be reached (through the preceding blocks of the complete $A$-sequence)
from the starting placement of the whole trip. We refer to such a placement as a reachable
position of the frogs. Let the flag $\varphi_i = \varphi(b_i)$ indicate whether the placement
$(a_1, b_i)$ is reachable through the preceding blocks of $A$ (in which case
$\varphi_i = 1$), or not ($\varphi_i = 0$); in this notation we hide the dependence of
$\varphi_i$ on the preceding layers and blocks. Note that if $A$ is the first block, then
$\varphi_i = 0$ for each $i > 1$, since $(a_1, b_i)$ is not reachable through the (empty set
of) preceding blocks. For $i = 1$ we set $\varphi_1 = 1$ if $b_1 \in D_1$, and otherwise abort
the entire procedure, since the frogs lie at their starting positions at distance $> 1$.
Figure 2: An illustration of the decision problem of the discrete Fréchet distance. The stepping stones of
the $A$-frog are the black points. The disks (of radius $\delta$) centered at the points of $A$ form the
arrangement $\mathcal{A}$. The stepping stones of the $B$-frog are the hollow points. In this example, a
legal path of the two frogs is $((a_1,b_1), (a_2,b_1), (a_2,b_2), (a_2,b_3), (a_3,b_3), (a_4,b_3))$.



We can store the data maintained by this process in more compact form. To do so, we
define an aggregate state (to which we refer as a state for short) to be a pair
$(f, S_f)$, where $S_f$ is a subset of $\mathcal{D}_f$; we refer to $S_f$ as the set of valid
disks (associated with our state). The set $S_f$ is assumed to have the property, dictated
by the transition rules for the frogs, that if $D_k \in S_f$ and $D_{k+1} \in \mathcal{D}_f$
then $D_{k+1}$ also belongs to $S_f$.

A state $(f, S_f)$ and a pair $(g, \varphi)$, where $g$ is a face of $\mathcal{A}$, and
$\varphi$ is a binary flag, determine a transition into a new state $(g, S_g)$, where
$S_g \subseteq \mathcal{D}_g$ consists of those disks $D_k \in \mathcal{D}_g$ for which there
exists $j \le k$ such that (i) $D_j \in S_f$, and (ii) the entire run
$D_j, D_{j+1}, \ldots, D_k$ is contained in $\mathcal{D}_g$. Furthermore, if $\varphi = 1$,
then $S_g$ also contains the maximal prefix of disks in $\mathcal{D}$ (starting with $D_1$)
that is contained in $\mathcal{D}_g$. The passage from $(f, S_f)$ to $(g, S_g)$ is called
a valid transition.
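The rule defining $S_g$ can be evaluated by a single left-to-right scan over the disks. The sketch below is one possible way to do so, under the following illustrative conventions: disks are named by 0-based indices (so index 0 stands for $D_1$), the sets $S_f$ and $\mathcal{D}_g$ are Python sets of indices, and the function name `transition` is hypothetical. The scan maintains whether the current maximal run of disks inside $\mathcal{D}_g$ already contains a disk of $S_f$ (or starts at $D_1$ while the flag is set).

```python
def transition(S_f, D_g, phi, m):
    """Compute S_g from the valid disks S_f of the current state, the set D_g of
    disks containing the new face g, and the input flag phi (m = number of disks)."""
    S_g = set()
    reachable = False            # does the current run inside D_g contain a valid start?
    for k in range(m):
        if k not in D_g:
            reachable = False    # the run is broken: D_k does not contain g
            continue
        if k in S_f or (phi and k == 0):
            reachable = True     # D_k itself is a valid start (j = k), or the prefix rule applies
        if reachable:
            S_g.add(k)           # some j <= k has D_j in S_f and D_j, ..., D_k all contain g
    return frozenset(S_g)
```

For example, with five disks, $S_f = \{D_2\}$ and $\mathcal{D}_g = \{D_1, D_2, D_3, D_5\}$, the scan returns $\{D_2, D_3\}$ when the flag is 0 and $\{D_1, D_2, D_3\}$ when it is 1; $D_5$ is excluded in both cases because $D_4$ interrupts the run.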

The interpretation of this setup is as follows. The state $(f, S_f)$ signifies that (a) the
$B$-frog is now at a point that belongs to $f$, and the $A$-frog lies at the center of some
disk $D_k \in S_f$, and (b) this position has been reached via a legal sequence of
interweaving $A$-moves and $B$-moves, starting from $(a_1, b_1)$ (if $A$ is the first block of
the whole sequence), or from some placement $(a_1, b_i)$ (if $A$ is an intermediate block),
which is reachable from the starting positions of the frogs (so $\varphi_i$ is 1). Moreover,
for the specific sequences of stepping stones for the $A$-frog and the $B$-frog, the $A$-frog
cannot lie at the center $a_k$ of any disk $D_k \notin S_f$.

The valid transition from $(f, S_f)$ to $(g, S_g)$ means that, for any disk $D_k \in S_g$, we
can get the $A$-frog to lie at its center $a_k$, and get the $B$-frog to lie in $g$, by taking
a disk $D_j$ as in the definition of the valid transition, assuming that the $A$-frog lies at
$a_j$ and the $B$-frog lies in $f$ (in accordance with the above interpretation of
$(f, S_f)$), moving the $B$-frog to $g$ (which is possible since $D_j$ also belongs to
$\mathcal{D}_g$), and then moving the $A$-frog through the centers $a_{j+1}, \ldots, a_k$, all
at distance at most 1 from the $B$-frog (or, if $j = k$, let the $A$-frog stay put). Moreover,
if the last move of the $B$-frog is from $f$ to $g$, and the $A$-frog lies at the center of
some disk in $S_f$, then the centers of the disks in $S_g$ are the only possible locations
that the $A$-frog can reach (with this single hop of the $B$-frog).

In addition, the flag $\varphi$ allows the $B$-frog to appear "out of nowhere" in the middle
of the first row of the block, in case a position $(a_1, b_i)$, where $b_i \in g$, is
reachable from the starting placement of the whole trip. This means that we can get the
$A$-frog to lie at $a_1$, and the $B$-frog to lie in $g$, by some path that starts at the
starting position of the entire trip of the frogs, and moves solely through the points of
the preceding blocks of the full sequence $A$ (once the $B$-frog has reached $g$, the
$A$-frog can move through the centers of the disks in the prefix of $\mathcal{D}_g$
contained in $S_g$, and stop at any of these centers before the $B$-frog makes its next
move).

The compression of basic states into aggregate states resembles the construction of a 
deterministic finite automaton (DFA) from a nondeterministic finite automaton (NFA). This 
is not accidental; we have already hinted that the algorithm simulates the moves of such an 
automaton, and the resemblance will become more relevant as we continue to present the 
algorithm. 

Remark: If we want to also consider the variant where the frogs are allowed to jump
simultaneously from a placement $(a_i, b_j)$ to $(a_{i+1}, b_{j+1})$ (provided that
$\|a_i - b_j\| \le 1$ and $\|a_{i+1} - b_{j+1}\| \le 1$), we only need to modify the above
rules of a valid transition. Specifically, a state $(f, S_f)$ and a pair $(g, \varphi)$, where
$g$ is a face of $\mathcal{A}$, and $\varphi$ is a binary flag, determine a transition into a
new state $(g, S'_g)$, where $S'_g \subseteq \mathcal{D}_g$ is the union of $S_g$, as defined
above, and of another subset of $\mathcal{D}_g$, consisting of those disks
$D_k \in \mathcal{D}_g$ for which there exists $j < k$ such that (i) $D_j \in S_f$, and (ii)
the entire run $D_{j+1}, \ldots, D_k$ is contained in $\mathcal{D}_g$ (so the disk $D_j$ is
not required to belong to the run).

A DFA interpretation. We can interpret the setup just described as a construction of
a deterministic finite automaton $\mathcal{K}$, as follows; for the convenience of the reader,
we include the following short glossary of the main notations used in this construction.

$f$ or $f_i$ (or $g$) - a face of $\mathcal{A}(\mathcal{D})$.

$F$ - string of $n$ faces.

$F_k$ - a substring of $F$ (of length $\tau$).

$\varphi$ or $\varphi_i$ - a binary flag.

$\Phi$ - string of $n$ flags.

$\Phi_k$ - a substring of $\Phi$ (of length $\tau$).

$\sigma$ - a pair $(f, \varphi)$ of a face $f$ and a flag $\varphi$.

$\Sigma$ - string of $n$ pairs.

$\Sigma_k$ - a substring of $\Sigma$ (containing $\tau$ pairs).

The states of $\mathcal{K}$ are the aggregate states $(f, S_f)$, where $f$ is a face of the
corresponding disk arrangement $\mathcal{A}$ and $S_f \subseteq \mathcal{D}_f$. The $i$-th
'character' in the string that $\mathcal{K}$ has to process is the pair $(g_i, \varphi_i)$,
where $g_i$ is the face of $\mathcal{A}$ that contains $b_i$, and $\varphi_i$ is a flag
indicating whether $(a_1, b_i)$ is a reachable position of the two frogs (in the sense
defined above, with respect to the whole trip). The transition from a state $(f, S_f)$ on
reading the pair $(g_i, \varphi_i)$ is to $(g_i, S_{g_i})$, where $S_{g_i}$ is defined as
above. The string that $\mathcal{K}$ has to process to handle the current block $A$ is thus
the string of pairs $\Sigma = \bigl((f_2, \varphi_2), \ldots, (f_n, \varphi_n)\bigr)$, where
$f_2, \ldots, f_n$ are the (not necessarily neighboring) faces of $\mathcal{A}$ containing the
corresponding actual points $b_2, \ldots, b_n$ of the $B$-sequence, and
$\varphi_2, \ldots, \varphi_n$ are the respective flags associated with $b_2, \ldots, b_n$, as
defined above.

The starting state of $\mathcal{K}$ is the state $(f_1, S_{f_1})$, where $f_1$ is the face
containing $b_1$, and where $S_{f_1} = \emptyset$ if $\varphi(b_1) = 0$, or $S_{f_1}$ is the
largest prefix of $\mathcal{D}$ contained in $\mathcal{D}_{f_1}$ if $\varphi(b_1) = 1$. Note
that in the construction of $\mathcal{K}$ we are not given that prefix; $\mathcal{K}$ is
defined in terms of $A$ only. Furthermore, when $\mathcal{K}$ does read the string $\Sigma$
and reaches a state $(f_i, S_{f_i})$, it outputs a new flag $\varphi(b_i)$, which is 1 if
$D_m \in S_{f_i}$ and is 0 otherwise. The points $b_i$ with output flag $\varphi(b_i) = 1$
are exactly those for which $(a_m, b_i)$ is reachable in $G_1$ (from the beginning of the
whole trip). In this context, we can think of $\mathcal{K}$ as a Moore machine [21], that is,
a finite-state transducer that associates an output value with each state. We can thus
associate the output flag $\varphi(b_i)$ with the state $(f_i, S_{f_i})$. The output flag
$\varphi(b_i)$ will be used later, as an input for the next block (see Section 2.3).

As noted earlier, if $A$ is the first block of the whole sequence, each flag of the input
sequence $\Sigma$, except for the first one, is equal to zero. For the first position
$(a_1, b_1)$ of the first block, we assume that $b_1 \in D_1$; otherwise, as already
mentioned, we abort the decision procedure right away, reporting that the Fréchet distance
$\delta_{dF}(A, B)$ is greater than 1. We thus set, after verifying this constraint,
$\varphi_1 = \varphi(b_1) = 1$.
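The behaviour of $\mathcal{K}$ on one block, and the way output flags of one block feed the next, can be simulated directly. The following sketch is only meant to illustrate the semantics; it makes no attempt at the running time discussed in this paper. Its assumptions: a state is represented by its set of valid disks alone, the face of each $b_i$ is represented by the set of disks containing it (computed by brute force), indices are 0-based, the starting state is obtained by applying the transition rule to the empty set with the first flag, and the names `process_block` and `decide_by_blocks` are hypothetical.

```python
import math

def process_block(block, B, phi_in, radius=1.0):
    """Simulate the automaton for one block. phi_in[i] = 1 iff (a_1, b_i) is reachable,
    where a_1 = block[0]. Returns the output flags: phi_out[i] = 1 iff (a_m, b_i) is."""
    m = len(block)
    S = frozenset()                                   # no valid disks before reading b_1
    phi_out = []
    for b, phi in zip(B, phi_in):
        D_g = {k for k, a in enumerate(block) if math.dist(a, b) <= radius}
        nxt, reachable = set(), False
        for k in range(m):                            # the valid-transition rule, as one scan
            if k not in D_g:
                reachable = False
                continue
            if k in S or (phi and k == 0):
                reachable = True
            if reachable:
                nxt.add(k)
        S = frozenset(nxt)
        phi_out.append(1 if (m - 1) in S else 0)      # output 1 iff D_m is a valid disk
    return phi_out

def decide_by_blocks(A, B, block_len, radius=1.0):
    """Chain the blocks: the last point of a block is the first point of the next one."""
    assert block_len >= 2
    # First block: only (a_1, b_1) can be flagged, and only if b_1 lies in D_1.
    phi = [1 if math.dist(A[0], B[0]) <= radius else 0] + [0] * (len(B) - 1)
    start = 0
    while True:
        end = min(start + block_len - 1, len(A) - 1)
        phi = process_block(A[start:end + 1], B, phi, radius)
        if end == len(A) - 1:
            return phi[-1] == 1                       # is (a_m, b_n) reachable?
        start = end
```

On small inputs the result of `decide_by_blocks` agrees with the quadratic dynamic program for every block length of at least 2, which is a convenient sanity check of the transition rule.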

Remark. The automaton $\mathcal{K}$ is constructed from the block $A$ only, without knowing
anything about the sequence $B$. Consequently, for each face $f$ of the arrangement
$\mathcal{A}$, we need to prepare states $(f, S_f)$ for each subset
$S_f \subseteq \mathcal{D}_f$ that might arise via some sequence of stepping stones of the
$B$-frog. As shown in Section 4, there are situations where the number of such feasible
subsets may be exponential in $|\mathcal{D}_f|$ (that is, in $m$). This is why we need to
take $m = c_2 \log n$, with $c_2$ sufficiently small, to control the size of $\mathcal{K}$ and
the time needed to construct it (so that they are both sublinear in $n$).

Constructing an efficient DFA. To obtain an overall procedure with subquadratic running
time, we modify the construction of $\mathcal{K}$ to obtain a somewhat more efficient
automaton $\mathcal{K}^*$ to handle a block $A$. There are two major improvements in the
construction of $\mathcal{K}^*$. The first, whose detailed description is deferred to
Section 2.2, is to construct $\mathcal{K}^*$ in terms of the finer arrangement
$\mathcal{A}^*$ of the disks centered at all the $O(\log^2 n)$ points of the $A$-sequence
within the layer containing the current block. Informally, the reason for doing it (explained
in detail in Section 2.2) is that it saves us the need to locate the $B$-points in each of
the coarser block arrangements, a process that would be too expensive for our purpose.
Nevertheless, so as not to throw at the reader the two improvements at the same time, we
present here the construction of $\mathcal{K}^*$ solely in terms of the current block
arrangement $\mathcal{A}$ and then modify it in the next subsection.

The second improvement aims to allow $\mathcal{K}^*$ to process the $B$-dependent string
$\Sigma$ in a faster manner. Specifically, we modify $\mathcal{K}$ so that each input
character that it reads is a string of $c_3 \log n / \log\log n$ consecutive input characters
of $\Sigma$, where $c_3 > 0$ is a sufficiently small constant, whose value will be determined
later. That is, we partition the input string $\Sigma$ of $\mathcal{K}$ into
$\mu = \Theta\left(\frac{n \log\log n}{\log n}\right)$ substrings
$\Sigma_1, \Sigma_2, \ldots, \Sigma_\mu$ of size $\tau = c_3 \log n / \log\log n$ each; the
last substring may be shorter. The states of $\mathcal{K}^*$ are the same aggregate states
$(f, S_f)$ of $\mathcal{K}$. When $\mathcal{K}^*$ is at state $(f, S_f)$ and is given a
substring $\Sigma_k = \bigl((f_1, \varphi_1), \ldots, (f_\tau, \varphi_\tau)\bigr)$, it moves
to state $(f_\tau, S_{f_\tau})$, where $(f_\tau, S_{f_\tau})$ is the state that $\mathcal{K}$
would have reached from $(f, S_f)$ after processing the input substring character by
character. (The subscripts used in the enumeration of the pairs of $\Sigma_k$ start at 1 for
the sake of simplicity. This involves a slight abuse of notation, because
$(f_1, \varphi_1)$ denotes here the first pair of $\Sigma_k$ and not the first pair of the
entire string $\Sigma$.)

Furthermore, a transition of $\mathcal{K}^*$ from a state $(f, S_f)$ to a state $(f_\tau, S_{f_\tau})$ as above
produces an output string $\Phi'_k = (\varphi'_1, \ldots, \varphi'_\tau)$, where $\varphi'_j$ is the output of $\mathcal{K}$ when it reaches
the state $(f_j, S_{f_j})$ (again, under the new enumeration convention). Recall that we
regarded $\mathcal{K}$ as a Moore machine, where the output flags $\varphi'_j$ are associated with the
corresponding states $(f_j, S_{f_j})$. Here, however, the state $(f_\tau, S_{f_\tau})$ that $\mathcal{K}^*$ moves to after reading
$((f_1, \varphi_1), \ldots, (f_\tau, \varphi_\tau))$ cannot determine the output string $\Phi'_k$ by itself: doing so requires
knowledge of the full sequence $((f_1, \varphi_1), \ldots, (f_\tau, \varphi_\tau))$ that led $\mathcal{K}^*$ to $(f_\tau, S_{f_\tau})$. More specifically,
the flags comprising $\Phi'_k$ are determined by the states $(f_j, S_{f_j})$ that $\mathcal{K}$ traversed on the way to
$(f_\tau, S_{f_\tau})$. To avoid having to look at each intermediate state $(f_j, S_{f_j})$ separately, we observe
that all these states are implicitly encoded in the transition edge of $\mathcal{K}^*$ that takes us from
$(f, S_f)$ to $(f_\tau, S_{f_\tau})$ upon reading $\Sigma_k$. We can therefore regard $\mathcal{K}^*$ as a Mealy machine [20],
that is, a finite-state transducer that associates an output value with each transition edge.
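The compaction step can be illustrated on a toy machine. The following is a minimal sketch, not the construction used in the paper: the names `compact`, `delta` and `output` are hypothetical, the Moore machine is given explicitly as dictionaries, and the possibly shorter last substring is ignored.

```python
from itertools import product

def compact(delta, output, states, alphabet, tau):
    """Turn a Moore machine into a Mealy-style table that reads tau characters at once.

    delta:  dict mapping (state, char) -> next state   (assumed complete)
    output: dict mapping state -> output flag of that state
    Returns a dict mapping (state, tau-tuple of chars) ->
    (end state, tuple of the tau outputs seen along the way).
    """
    table = {}
    for s in states:
        for block in product(alphabet, repeat=tau):
            cur, outs = s, []
            for ch in block:
                cur = delta[(cur, ch)]     # one step of the original machine
                outs.append(output[cur])   # Moore output of the state just reached
            # The outputs are attached to the compacted edge, not to the end state.
            table[(s, block)] = (cur, tuple(outs))
    return table
```

The table has one entry per (state, block) pair, which is why the block length has to stay small enough for the table to remain sublinear in size.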

The rest of the description of $\mathcal{K}^*$ remains the same as that of $\mathcal{K}$.

In the following, we describe how to construct $\mathcal{K}^*$ so that a state transition can be
carried out in constant time. (A full description of the construction of $\mathcal{K}^*$ will be given in
the next subsection.) As is shown later, executing a transition of $\mathcal{K}^*$ in constant time is
essential for obtaining the subquadratic running time of the whole optimization procedure.

To construct $\mathcal{K}^*$, we build the transition table $T$, according to the rules stated above.
Since $T$ is constructed independently of the input string $\Sigma$, we must prepare, for each state
$(f, S_f)$ of $\mathcal{K}^*$, all possible transitions to a new state. That is, given a state $(f, S_f)$ we store,
for each possible input substring $\Sigma_k$ of length $\tau$, the state $(g, S_g)$ that $\mathcal{K}^*$ moves to after
processing $\Sigma_k$ (assuming that $\mathcal{K}^*$ was in state $(f, S_f)$ just before reading $\Sigma_k$). To be more
precise, we prepare the transition table $T$ of $\mathcal{K}^*$ as a collection of arrays $L_{(f,S_f)}$, one array for
each state $(f, S_f)$ of $\mathcal{K}^*$. The array $L_{(f,S_f)}$ of a fixed state $(f, S_f)$ is defined so that, for each
index $j$ encoding a substring $\Sigma_k$ (details of the encoding are provided next), $L_{(f,S_f)}[j]$ is the
pair $((g, S_g), \Phi'_k)$, where $(g, S_g)$ is the state that $\mathcal{K}^*$ moves to after processing $\Sigma_k$ (assuming
that $\mathcal{K}^*$ was in state $(f, S_f)$ just before reading $\Sigma_k$), and $\Phi'_k$ is the output substring of flags
that corresponds to this transition.

To complete the description of $T$, we now describe a simple encoding scheme that
converts each string $\Sigma_k = ((f_1, \varphi_1), \ldots, (f_\tau, \varphi_\tau))$ of $\tau$ pairs into an integer $e(\Sigma_k)$ of $O(\log n)$
bits. To do so, each face $f$ of the arrangement of the disks is given an integer label
$e(f)$ in the range $(0, \ldots, c \log^4 n)$, for an appropriate absolute constant $c$. (We will later
explain, as part of the full description of the construction of $\mathcal{K}^*$, why we use this range and
how to generate the labels efficiently.) Clearly, at most $\beta = \log(c \log^4 n) = c' \log\log n$ bits
are needed for such a label, for another absolute constant $c'$ (close to 4). We now put

$$e(\Sigma_k) = \sum_{i=1}^{\tau} \varphi_i \cdot 2^{i-1} + \sum_{i=1}^{\tau} e(f_i) \cdot 2^{\beta(i-1)+\tau}, \tag{1}$$

and note that $e(\Sigma_k)$ does indeed consist of only $\tau(\beta + 1) = O(\log n)$ bits. Clearly, this is a
one-to-one encoding.


With this setup, each state transition can be executed in constant time. Specifically,
when $\mathcal{K}^*$ is in state $(f, S_f)$ and is given the encoding $e(\Sigma_k)$ of an input substring $\Sigma_k$, we
follow a pointer to the array $L_{(f,S_f)}$ and retrieve its entry $L_{(f,S_f)}[e(\Sigma_k)]$ in constant time.
This gives us the next state $(g, S_g)$ and the corresponding output bitstring $\Phi'_k$. Hence, the
execution of $\mathcal{K}^*$, when given $O(n \log\log n / \log n)$ substrings as above, takes $O(n \log\log n / \log n)$ time.
This cost excludes the computation of the indices $e(\Sigma_k)$, which will be discussed in the next
subsection.
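The encoding and the table lookup can be sketched as follows. This is only an illustration under stated assumptions: the function names `encode`, `decode` and `run` are hypothetical, face labels are assumed to fit in `beta` bits, flags are 0 or 1, and the arrays are modelled as any indexable container keyed by state and then by code.

```python
def encode(pairs, beta):
    """Pack tau (label, flag) pairs into one integer.

    The tau flags occupy the low tau bits; the i-th label (0-based) is
    shifted by beta * i + tau, so the total is tau * (beta + 1) bits.
    """
    tau = len(pairs)
    code = 0
    for i, (label, flag) in enumerate(pairs):
        code |= flag << i
        code |= label << (beta * i + tau)
    return code

def decode(code, tau, beta):
    """Inverse of encode, showing that the encoding is one-to-one."""
    pairs = []
    for i in range(tau):
        flag = (code >> i) & 1
        label = (code >> (beta * i + tau)) & ((1 << beta) - 1)
        pairs.append((label, flag))
    return pairs

def run(arrays, state, codes):
    """Feed encoded substrings to the compacted automaton, one lookup each."""
    outputs = []
    for code in codes:
        state, out = arrays[state][code]   # a single table access per substring
        outputs.append(out)
    return state, outputs
```

Each call to `run` performs exactly one array access per encoded substring, which is the constant-time transition the text relies on.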

The size of (number of entries in) $T$ is the number of states of $\mathcal{K}^*$, multiplied by the
number of possible input substrings for $\mathcal{K}^*$. The latter number is $2^{(\beta+1)\tau} \le 2^{c'' \log n}$, where
$c''$ is proportional to $c_3$, which we choose sufficiently small so as to have $c'' < 1/4$, say. The
number of states of $\mathcal{K}^*$ is $O(m^2 2^m)$, where $m = c_2 \log n$ is the size of a block.$^1$ There are
$O(m^2)$ faces in the disk arrangement, and, in view of the construction given above,
we use the pessimistic bound of $2^m$ on the number of possible subsets $S_f$ for any fixed face
$f$. Choosing $c_2$ sufficiently small, we can ensure that the number of states of $\mathcal{K}^*$ is at most
$O(n^{1/4})$, say. Hence the size of $T$ is $O(n^{1/2}) = O(n \log\log n / \log n)$, and it can be built
within the same asymptotic time bound.
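For concreteness, the counting above can be spelled out as follows. This is only a sketch, assuming that logarithms are to base 2, that $n$ is sufficiently large (so that $\log\log n \ge 1$), and that $c'' = (c'+1)c_3$ is one admissible choice of the constant:

$$\begin{aligned}
(\beta+1)\tau &= (c' \log\log n + 1) \cdot \frac{c_3 \log n}{\log\log n} \le (c'+1)\, c_3 \log n = c'' \log n, \\
2^{(\beta+1)\tau} &\le n^{c''} < n^{1/4}, \\
m^2 2^m &= c_2^2 \log^2 n \cdot n^{c_2} = O(n^{1/4}) \quad \text{for } c_2 < 1/4, \\
|T| &= O(n^{1/4}) \cdot n^{c''} = O(n^{1/2}).
\end{aligned}$$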

2.2 Handling a layer of A 

In order to make the whole procedure efficient, we need to construct quickly the encodings of
the input strings for the automata of the blocks of $A$. Note that we cannot even afford linear
(i.e., $O(n)$) time for this preparation for each block, because this would result in the overall
bound $O(mn / \log n)$ for the running time of the decision procedure, which, multiplied by
the number $O(\log n)$ of binary search steps, would yield $O(mn)$ overall running time,
defeating our goal of obtaining a subquadratic solution.

This is the reason for using a two-stage partitioning of $A$, first into layers of size $c_1 \log^2 n$
each, and then into blocks of size $c_2 \log n$ each. The preparation of the strings is done
mainly at the layer level, thereby making the cost sublinear for each block.

Here are the details of this preprocessing step. Fix a layer of $A$, which contains
$t = \Theta(\log n)$ blocks, of size $c_2 \log n$ each, which we enumerate as $A_1, \ldots, A_t$. As before, the
last point of $A_i$, for $1 \le i < t$, is the first point of $A_{i+1}$. We process $A_1, \ldots, A_t$ in order, in
much the same way as described in Section 2.1, except that some of the preparatory steps
are grouped together, and take place during the preprocessing of the layer.

In more detail, we first construct the arrangement $\mathcal{A} = \mathcal{A}(\mathcal{D})$, where $\mathcal{D}$ is the set
of $c_1 \log^2 n$ unit disks centered at the points of the layer; the number of faces of $\mathcal{A}$ is at most
$c \log^4 n$, for an appropriate constant $c$ (the same constant appearing in the encoding in the
previous subsection). We preprocess $\mathcal{A}$ for efficient point location, using any of the standard
techniques, in $O(\log^4 n \log\log n)$ time. Fix a block $A_j$ of the layer, and note that each face $f$ of $\mathcal{A}$
is a subface of a face of the arrangement $\mathcal{A}_j$ of the disks centered at the points of $A_j$.
We find these correspondences by preprocessing each $\mathcal{A}_j$ for fast point location, and then,
for each face $f$ of $\mathcal{A}$ we pick an arbitrary point in $f$ and locate it in $\mathcal{A}_j$, thereby obtaining
the face of $\mathcal{A}_j$ that contains $f$. In this way each face of $\mathcal{A}$ stores $t$ pointers to its "super-faces", one for each $j = 1, \ldots, t$.

$^1$ The first improvement in $\mathcal{K}^*$, deferred to Section 2.2, will cause the number of states to increase to
0(m 2 m ), which will have negligible effect on the performance of the algorithm; see below for details.

Next, for each point $b_j$ of the $B$-sequence, we locate the face $f_j$ of $\mathcal{A}$ containing $b_j$,
using the point location structure. This takes $O(n \log\log n)$ time. We obtain a sequence
$F = (f_1, f_2, \ldots, f_n)$ of faces of $\mathcal{A}$, and we partition it into $\mu$ subsequences $F_1, \ldots, F_\mu$, each
consisting of $\tau$ consecutive faces, where $\mu = O(n \log\log n / \log n)$ and $\tau = c_3 \log n / \log\log n$,
as in the preceding subsection.

Now comes the other improvement in the construction of the block-automata $\mathcal{K}^*$
considered in Section 2.1. Specifically, since the number of faces of $\mathcal{A}$ is at most $c \log^4 n$, we
label each face $f$ of $\mathcal{A}$ by an integer $e(f)$ in the range $(0, \ldots, c \log^4 n)$. For each of the $\mu$
subsequences $F_k$ of $F$, say $F_k = (f_1, \ldots, f_\tau)$, we compute the "partial" index (cf. (1))

$$e_0(F_k) = \sum_{i=1}^{\tau} e(f_i) \cdot 2^{\beta(i-1)+\tau}. \tag{2}$$

Note that, given the labels $e(f_i)$, $e_0(F_k)$ can be computed by $O(\tau)$ additions and
multiplications (or, rather, shifts). In addition, note that this index is common to all the blocks
of the layer; we stress again that each such partial index is computed only once within the
layer.

Now fix a block $A_j$ of the layer, and consider the construction of its automaton $\mathcal{K}^*_j$. Except for
the fact that the faces of $\mathcal{A}$ that we use here are smaller than the respective faces of the
block arrangement $\mathcal{A}_j$, the states $(f, S_f)$ and the transition rules for $\mathcal{K}^*_j$ are very similar to
those used in Section 2.1. More specifically, each face $f_0$ of $\mathcal{A}_j$ is now the union of some
faces of $\mathcal{A}$. Every state of the form $(f_0, S_f)$ that we had before is now copied, for each face
$f \subseteq f_0$ of $\mathcal{A}$, to a state $(f, S_f)$. A similar copying is applied to the transition rules. That is,
consider first the non-compacted automaton $\mathcal{K}_j$. If it is at a state $(f, S_f)$ and reads a pair
$(g, \varphi)$, where $f$ and $g$ are now faces of $\mathcal{A}$, we apply the same transition rule that the original
$\mathcal{K}_j$ obeys when it is at state $(f_0, S_f)$ and reads $(g_0, \varphi)$, where $f_0$ (resp., $g_0$) is the face of $\mathcal{A}_j$
containing $f$ (resp., $g$). We now obtain the new version of $\mathcal{K}^*_j$ from the new version of $\mathcal{K}_j$
in the same manner as above. That is, when $\mathcal{K}^*_j$ is at state $(f, S_f)$ and reads a substring
$\Sigma_k = ((f_1, \varphi_1), \ldots, (f_\tau, \varphi_\tau))$ of $\Sigma$, where now $f, f_1, \ldots, f_\tau$ are faces of $\mathcal{A}$, it moves to the
state $(f_\tau, S_{f_\tau})$ obtained by running the new $\mathcal{K}_j$ on the pairs of $\Sigma_k$ one by one.

The total time for computing the $\mu$ indices $e_0(F_k)$ is linear in $n$. This is tolerable
since we carry out this computation only once for the entire layer. However, each of the
subsequences $\Sigma_k$ that we feed into the various block automata $\mathcal{K}^*_j$ has a second "component"
that depends on the input flags at the first row of the respective block $A_j$. Specifically, each
$\Sigma_k$ is of the form $((f_1, \varphi_1), \ldots, (f_\tau, \varphi_\tau))$, which we can represent by the pair $(F_k, \Phi_k)$, where
$F_k = (f_1, \ldots, f_\tau)$ and $\Phi_k = (\varphi_1, \ldots, \varphi_\tau)$. The subsequences $F_k$ are computed once, at the
layer level, and do not change from block to block, but the subsequences $\Phi_k$ do depend on
the blocks. In terms of the encoding in (1) we have

$$e(\Sigma_k) = e_0(F_k) + e_0(\Phi_k), \tag{3}$$

where

$$e_0(\Phi_k) = \sum_{i=1}^{\tau} \varphi_i \cdot 2^{i-1} \tag{4}$$

is simply the bitstring consisting of the flags in $\Phi_k$.

We can easily construct the automata $\mathcal{K}^*_j$ in such a way that the output of each transition
is the encoding $e_0(\cdot)$ of the corresponding output flag sequence. Assuming that this is the case, we
process a block $A_j$ as follows. Let $\Phi_1, \ldots, \Phi_\mu$ denote the output flag subsequences from the
execution of the preceding automaton $\mathcal{K}^*_{j-1}$ (or from the execution of the last automaton
in the preceding layer, or from the initialization of the entire procedure). By assumption,
we are actually given the encodings $e_0(\Phi_1), \ldots, e_0(\Phi_\mu)$ (the computation of these bitstrings
during initialization is trivial and inexpensive), and we substitute them in (3) to obtain
$e(\Sigma_1), \ldots, e(\Sigma_\mu)$. This computation takes $O(\mu) = O(n \log\log n / \log n)$ time for each block,
for a total of $O(n \log\log n)$ time for the whole layer. We now run (the modified automaton)
$\mathcal{K}^*_j$ on the string $(e(\Sigma_1), \ldots, e(\Sigma_\mu))$ and obtain the output sequence $e_0(\Phi'_1), \ldots, e_0(\Phi'_\mu)$,
where $\Phi'_1, \ldots, \Phi'_\mu$ are the flag subsequences output by the state transitions of $\mathcal{K}^*_j$, which are
the input for the next automaton.
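The split of the encoding into a layer-level part and a block-level part, as in (2), (3) and (4), can be sketched as follows. The function names are hypothetical, labels are assumed to fit in `beta` bits, and flags are assumed to be 0 or 1.

```python
def partial_face_index(labels, beta):
    """e_0(F_k): computed once per layer from the face labels of one subsequence."""
    tau = len(labels)
    return sum(label << (beta * i + tau) for i, label in enumerate(labels))

def flag_index(flags):
    """e_0(Phi_k): the tau flags packed into the low tau bits."""
    return sum(flag << i for i, flag in enumerate(flags))

def full_index(e0_faces, e0_flags):
    """e(Sigma_k): one addition per substring and per block."""
    return e0_faces + e0_flags
```

Since the two parts occupy disjoint bit ranges, the addition never produces a carry, so each block needs only one addition per substring on top of the indices precomputed for the layer.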

The analysis in Section 2.1 shows that, with an appropriate choice of the constants
$c_1, c_2, c_3$, the construction of the automata $\mathcal{K}^*_j$, for $j = 1, \ldots, t$, takes a total of $O(n)$ time (in
fact, much smaller if we so wish). Processing a single block costs $O(n \log\log n / \log n)$ time
(see Section 2.1 and the preceding paragraph). Since the layer contains $\Theta(\log n)$ blocks,$^2$ the total
cost for processing it is $O(n \log\log n)$. (This includes the cost of the point location stage
within $\mathcal{A}$, which is also $O(n \log\log n)$.) In conclusion, processing a single layer, including
the processing of each of its blocks, takes a total of $O(n \log\log n)$ time.

The space required for this procedure is linear in $n$, since we need to store the
subsequences of faces of $\mathcal{A}$, which are used as input for each $\mathcal{K}^*_j$. The space used for handling
a block $A_i$ of the layer is sublinear in $n$ (see Section 2.1), and can be freed after processing $A_i$.
Hence, the total space required for processing the layer is still linear in $n$.

2.3 The overall procedure 

To obtain an overall algorithm with subquadratic time, we partition the original sequence
$A$ into $\Theta(m / \log^2 n)$ layers $A_1, A_2, \ldots$, each (except possibly for the last one) consisting of
$c_1 \log^2 n$ points, and so that the last point of $A_i$ is the first point of $A_{i+1}$ for each $i$. We
then process $A_1, A_2, \ldots$ in succession.

To process a layer $A_i$, we use the procedure of Section 2.2. If $A_i$ is not the last layer of $A$,
we use the output sequence $\Phi_1, \ldots, \Phi_\mu$ of $A_i$ as input for $A_{i+1}$ (as described in Section 2.2).
Otherwise, $A_i$ is the last layer of $A$, and we use the last flag $\varphi_\tau$ of the last subsequence $\Phi_\mu$
to determine the outcome of the decision process: if $\varphi_\tau = 1$ we report that $\delta_{dF}(A, B) \le \delta$;
otherwise $\delta_{dF}(A, B) > \delta$.

Processing each layer $A_i$ of $A$ takes $O(n \log\log n)$ time, so processing the $\Theta(m / \log^2 n)$
layers takes $O(mn \log\log n / \log^2 n)$ time. The space required for handling a layer $A_i$ of $A$
is linear in $n$ (see Section 2.2), and it can be freed after handling $A_i$. Hence, the space
required by the decision procedure is only $O(n + m)$ (we need $O(m)$ space for storing $A$).

Hence, we obtain the following intermediate result. 

Theorem 2.1. Given two sequences $A$, $B$ of stepping stones, of respective sizes $m$ and $n$,
with $m \le n$, and a parameter $\delta > 0$, we can decide, using $O\left(\frac{mn \log\log n}{\log^2 n}\right)$ time and $O(n + m)$
space, whether $\delta_{dF}(A, B) \le \delta$.

$^2$ This step in the analysis is the reason for restricting the size of a layer to $\Theta(\log^2 n)$ points of $A$, that is,
to $\Theta(\log n)$ blocks.

Remark. The above procedure determines whether $\delta_{dF}(A, B) > \delta$ or $\delta_{dF}(A, B) \le \delta$. In
the latter situation, there is no need to discriminate between $\delta_{dF}(A, B) < \delta$ and $\delta_{dF}(A, B) = \delta$,
since this can easily be done upon termination of the binary search, as described in
Section 3, by comparing two consecutive critical values of $\delta$ reached at the end of the search.
See Section 3 for more details.
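As a point of reference for the decision problem of Theorem 2.1, here is a minimal sketch of the straightforward quadratic dynamic program mentioned in the abstract; it is not the automaton-based procedure of this section. The function name `decide` and the `simultaneous` switch are assumptions: whether both frogs may hop in the same step depends on the edge set of $G_\delta$ as defined in the introduction. Each row of flags is passed on to the next point of $A$, much as the flag sequences are passed from one block or layer to the next.

```python
import math

def decide(A, B, delta, simultaneous=True):
    """Return True iff (a_m, b_n) is reachable from (a_1, b_1) with leash length delta.

    A, B: non-empty lists of (x, y) points. Runs in O(mn) time and O(n) space.
    simultaneous: assumption on the allowed moves (both frogs hopping at once).
    """
    prev = None                       # reachability flags of the previous row
    for i, a in enumerate(A):
        row = [False] * len(B)
        for j, b in enumerate(B):
            if math.dist(a, b) > delta:
                continue              # position not legal for this delta
            if i == 0 and j == 0:
                row[j] = True         # starting position of the frogs
                continue
            from_a = i > 0 and prev[j]                      # A-frog hopped
            from_b = j > 0 and row[j - 1]                   # B-frog hopped
            both = simultaneous and i > 0 and j > 0 and prev[j - 1]
            row[j] = from_a or from_b or both
        prev = row                    # the old row can be discarded
    return prev[-1]
```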

3 Solving the optimization problem 

We use the decision procedure in Section 2 to solve the optimization problem, as follows.
First note that the critical values of $\delta$, at which an edge is added to the graph $G_\delta$ (as $\delta$
increases), are the pairwise distances between a point of $A$ and a point of $B$. Hence, it
suffices to perform a binary search over all $mn$ such distances, and execute the
decision procedure in each step of the search. At each such step, the corresponding pairwise
distance is the $\ell$-th smallest pairwise distance in $A \times B$ for some value of $\ell$. We can find
this distance, e.g., using a variant of one of the algorithms of Agarwal et al. [1], which
runs in time close to $O(n^{3/2})$. This algorithm can easily be adapted to the "bichromatic"
scenario, where we consider only distances between the pairs in $A \times B$ (as opposed to
finding distances between the points of a single set). More specifically, we use a variant of
the simpler (sequential) decision procedure of [1]. We partition the set $A$ into $\lceil m / n^{1/2} \rceil$
smaller subsets, each of size at most $n^{1/2}$, and operate on each subset independently, coupled
with the whole $B$. In processing such a subset $A_i$, we construct the arrangement of the disks
of radius $\delta$ centered at the points of $A_i$, and locate the points of $B$ in this arrangement,
exactly as in [1]. Altogether, this yields the number of pairs in $A \times B$ at distance at
most $\delta$, which is what the decision procedure needs. The overall cost of this procedure is
$O(n^{3/2} \log n)$. Finally, we solve the optimization version of the distance selection algorithm
using parametric searching, increasing the running time to $O(n^{3/2} \log^3 n)$. This running
time is subsumed by the cost of the decision procedure of Section 2.$^3$

Since we call the decision procedure $O(\log n)$ times during the search, we obtain the
following main result of the paper.

Theorem 3.1. Given two sequences $A$, $B$ of stepping stones, of respective sizes $m$ and $n$,
with $m \le n$, we can compute the discrete Fréchet distance between $A$ and $B$ in $O\left(\frac{mn \log\log n}{\log n}\right)$
time and $O(n + m)$ space.
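The binary search over critical values can be sketched as follows. This is a simplified illustration, not the procedure of the paper: it materializes and sorts all pairwise distances (using $\Theta(mn)$ storage) instead of using distance selection, and it uses a plain quadratic decision procedure in which simultaneous hops are allowed (an assumption). The names `decide` and `discrete_frechet` are hypothetical.

```python
import math

def decide(A, B, delta):
    """Quadratic decision procedure: is the discrete Frechet distance <= delta?"""
    prev = None
    for i, a in enumerate(A):
        row = [False] * len(B)
        for j, b in enumerate(B):
            if math.dist(a, b) <= delta:
                row[j] = ((i == 0 and j == 0)
                          or (i > 0 and prev[j])
                          or (j > 0 and row[j - 1])
                          or (i > 0 and j > 0 and prev[j - 1]))
        prev = row
    return prev[-1]

def discrete_frechet(A, B):
    """Smallest critical value (pairwise distance) accepted by decide."""
    crit = sorted({math.dist(a, b) for a in A for b in B})
    lo, hi = 0, len(crit) - 1          # the largest critical value always succeeds
    while lo < hi:
        mid = (lo + hi) // 2
        if decide(A, B, crit[mid]):
            hi = mid                   # answer is crit[mid] or smaller
        else:
            lo = mid + 1               # answer is strictly larger
    return crit[lo]
```

The search is valid because the outcome of the decision procedure is monotone in $\delta$; it makes $O(\log(mn)) = O(\log n)$ calls to `decide`.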

4 An exponential lower bound on the number of states 

An interesting question that arises right away in the design of the algorithm is how
large $\mathcal{K}^*$ can be, that is, how many aggregate states (and transition rules) it can have.
Unfortunately, the following construction shows that this number can be exponential in $m$
in the worst case.

$^3$ Although there are more efficient algorithms for distance selection, which run in close to $O(n^{4/3})$ time [1, 18],
this simple-minded solution suffices for our purpose, and it has the advantage that it only uses linear
storage.

The construction, depicted in Figure 3, uses an even number of disks; with a slight
abuse of notation, we denote their number by $2m$. Enumerate the disks as $D_1, D_2, \ldots, D_{2m}$
and their respective centers as $a_1, a_2, \ldots, a_{2m}$. All these centers lie on the $x$-axis in the
right-to-left order $a_1, a_3, \ldots, a_{2m-1}, a_2, a_4, \ldots, a_{2m}$. The centers of the even-indexed disks
(red disks for short) are sufficiently close to each other, so that these disks have a large
common intersection. The odd-indexed disks (blue disks for short) are placed so that, for
each $k = 1, \ldots, m$, $D_{2k-1}$ intersects $D_{2k}$ (in a small cap) but is disjoint from $D_{2k+2}$ (the
second condition is vacuous for $k = m$).




Figure 3: A configuration of disks with an exponential number of states. The red disks are drawn solid 
and the blue disks are drawn dashed. 

We next place $2m + 1$ points $b_1, b'_1, b_2, b'_2, \ldots, b_m, b'_m, b_{m+1}$ (or, rather, select $2m + 1$
corresponding faces $f_1, f'_1, f_2, f'_2, \ldots, f_m, f'_m, f_{m+1}$ of the resulting arrangement of the $2m$
disks). For each $i = 1, \ldots, m$, we take $f_i$ to be the cap $D_{2i-1} \cap D_{2i}$ (by construction, and as
shown in the figure, these are indeed faces of the arrangement). We take $f'_i$ to be the face
lying directly above $f_i$, so that in order to go from $f_i$ to $f'_i$ we need to exit the two disks
$D_{2i-1}$ and $D_{2i}$ (and not to cross the boundary of any other disk). Finally, we take $f_{m+1}$ to
be the intersection face of all the red (even-indexed) disks.

We regard $(a_1, b_1)$ as the starting position of the frogs, where $b_1$ is any point in $f_1$ and
$a_1$ is the center of $D_1$, and the goal position is $(a_{2m}, b_{m+1})$, where $b_{m+1}$ is any point in $f_{m+1}$
and $a_{2m}$ is the center of $D_{2m}$.

By construction, $\mathcal{D}_{f_{m+1}}$ consists of all the $m$ red disks. We claim that for every subset
$S \subseteq \mathcal{D}_{f_{m+1}}$, $(f_{m+1}, S)$ is a valid state, obtaining the asserted exponential number of states.
To be more precise, the claim is that for any such $S$ we can construct a sequence $B = B_S$ of
points, which (i) starts at $b_1$ and ends at $b_{m+1}$, (ii) contains all the points $b_1, b_2, \ldots, b_{m+1}$ (in
this order), and (iii) contains some of the points $b'_1, \ldots, b'_m$, so that if it contains $b'_j$ then $b'_j$
appears between $b_j$ and $b_{j+1}$. The sequence $B_S$ has the property that for any $D \in S$, as the
$B$-frog moves through the sequence $B_S$, the $A$-frog can execute a sequence of corresponding
moves, so that it reaches at the end the center of $D$, and this cannot be achieved (for the
same sequence $B_S$) for any $D \notin S$. For simplicity, we only specify the sequence of faces of $\mathcal{A}$
containing the points of $B$, rather than the points themselves (although the figure depicts
the points too).

So let $S \subseteq \mathcal{D}_{f_{m+1}}$ be given. We associate with $S$ the following sequence $F_S$ of faces. We
start with the subsequence $(f_1, f_2, \ldots, f_m, f_{m+1})$ and, for each $D_{2k}$ not in $S$, we insert $f'_k$
into $F_S$, between $f_k$ and $f_{k+1}$. Figuratively, the corresponding sequence $B_S$, which proceeds
from right to left, is a mixture of sharp vertical detours (corresponding to red disks not in
$S$) and of short horizontal moves (for red disks in $S$).
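The rule for building $F_S$ can be sketched in a few lines. This is only an illustration: the function name `face_sequence` is hypothetical, faces are represented by strings, and the set $S$ is given by the indices $k$ of the red disks $D_{2k}$ that it contains.

```python
def face_sequence(m, S):
    """Build F_S for a subset S of red disks, given as indices k with D_{2k} in S.

    Faces are named "f1", ..., "f{m+1}"; a detour face f'_k is named "fk'".
    """
    seq = []
    for k in range(1, m + 1):
        seq.append(f"f{k}")
        if k not in S:
            seq.append(f"f{k}'")   # vertical detour for a red disk not in S
    seq.append(f"f{m + 1}")
    return seq
```

Distinct subsets yield distinct sequences, so there are $2^m$ sequences in total, one for each candidate state $(f_{m+1}, S)$.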

We next argue that $F_S$ does indeed generate the state $(f_{m+1}, S)$. Consider a red disk
$D_{2k}$ not in $S$. When the $B$-frog follows the detour from $f_k$ to $f'_k$ and then to $f_{k+1}$, it leaves
$D_{2k-1}$ and $D_{2k}$ and then re-enters $D_{2k}$ (and $D_{2k+2}$). The maximal run of disks which ends
at $D_{2k}$ and is contained in $\mathcal{D}_{f_{k+1}}$ includes $D_{2k}$ only, since $D_{2k-1} \notin \mathcal{D}_{f_{k+1}}$. In addition, $D_{2k}$
does not belong to $\mathcal{D}_{f'_k}$, so in particular $D_{2k} \notin S_{f'_k}$. Hence, $D_{2k} \notin S_{f_{k+1}}$, because there is
no valid transition (in this setup) from $(f'_k, S_{f'_k})$ to $(f_{k+1}, S_{f_{k+1}})$ such that $D_{2k} \in S_{f_{k+1}}$ (see
the rules for a valid transition in Section 2.1). From this point on, the path is fully outside
of $D_{2k-1}$, so, as easily verified by induction, $D_{2k}$ will not appear in any of the following
states, including the state $(f_{m+1}, S_{f_{m+1}})$, as claimed. (The reader might wish to interpret
this argument in terms of the actual moves of the frogs.)

Consider next a red disk $D_{2k}$ that belongs to $S$. It suffices to show that when the $B$-frog
reaches $f_k$, the $A$-frog could have executed a sequence of preceding moves that gets it to
the center of $D_{2k}$; this is because, from this point on, the $B$-frog remains inside $D_{2k}$ (note
that, by construction, we do not execute the detour via $f'_k$), so the $A$-frog simply has to
stay put at the center of $D_{2k}$ and wait for the end of the sequence of moves of the $B$-frog.

Note that $f_1$ is contained in all blue disks and in $D_2$. In particular, this implies the
asserted property for $k = 1$: The $A$-frog goes from the center of $D_1$ to the center of $D_2$
before the $B$-frog moves, and stays there till the end. In general, $f_j$ is contained in the
blue disks $D_{2j-1}, D_{2j+1}, \ldots, D_{2m-1}$ and in the red disks $D_2, D_4, \ldots, D_{2j}$. What the $A$-frog
needs to do is to ensure that, for each $j < k$, it lies at the center of $D_{2j+1}$ by the time the
$B$-frog gets to $f_j$. This is easily argued by induction on $j$. The $A$-frog can do this for $j = 1$,
because $f_1$ lies in $D_1, D_2, D_3$. For larger values of $j$, assume that the $A$-frog is at the center
of $D_{2j-1}$ when the $B$-frog is at $f_{j-1}$. If the path goes straight to $f_j$, it exits $D_{2j-3}$ and
then enters $D_{2j}$. Since the $A$-frog is at the center of $D_{2j-1}$, it can now move to the center
of $D_{2j}$ and then to the center of $D_{2j+1}$, as desired. If the path goes to $f_j$ via $f'_{j-1}$, it exits
$D_{2j-3}$ and $D_{2j-2}$, then re-enters $D_{2j-2}$ and then enters $D_{2j}$. However, since the $A$-frog is
already at the center of $D_{2j-1}$, this additional exit and re-entry are irrelevant for it, and
it can now move to the center of $D_{2j+1}$ as above. Finally, when the $B$-frog moves to $f_k$, the
$A$-frog, which is now at the center of $D_{2k-1}$, moves to the center of $D_{2k}$ and stays there.
This completes the argument.

Remark. It is a challenging open problem to circumvent this exponential lower bound
on the number of possible states. Of course, we have exponentially many states because
of the existence of exponentially many possible $B$-sequences. Is it possible, for example,
to reduce the number of states significantly by some sort of examination of the specific
input $B$-sequence? As already remarked, the existence of potentially exponentially many
states is the major bottleneck for the efficiency of the algorithm. In the same vein, it would
be interesting to find properties of the sequences $A$, $B$ that guarantee that the number
of aggregate states is much smaller. In a sense, this would hopefully subsume (so far, for
the discrete and semi-continuous cases only) the earlier studies involving special classes of
curves and/or sequences [4, 5, 13].






5 Discussion and open problems 



We obtained an algorithm for computing the discrete Fréchet distance between two sequences of
points, which runs in subquadratic time. A natural open problem that arises right away
is whether this algorithm can be extended to compute the continuous Fréchet distance
between two polygonal curves in subquadratic time. Even solving the semi-continuous
Fréchet distance problem in subquadratic time might be interesting at this point. It would also be
interesting to know whether this time bound, which is still rather close to quadratic, can be further
reduced (see the remark at the end of the preceding section).